Abstract

We study the on-line minimum weighted bipartite matching problem in arbitrary
metric spaces. Here, $n$ not necessarily disjoint points of a metric space $M$
are given, and are to be matched on-line with $n$ points of $M$ revealed one by one.
The cost of a matching is the sum of the distances of the matched points, and
the goal is to find or approximate its minimum. The competitive ratio of the
deterministic problem is known to be $\Theta(n)$, see [7, 11]. It was conjectured in [8]
that a randomized algorithm may perform better against an oblivious adversary,
namely with an expected competitive ratio $O(\log n)$. We prove a slightly weaker
result by showing an $o(\log^3 n)$ upper bound on the expected competitive ratio. As
an application the same upper bound holds for the notoriously hard fire station
problem, where $M$ is the real line, see [6, 12].



Finding a minimum weight matching in a weighted graph $G$ is a well studied problem
in graph theory. Much less is known about its on-line version; here we briefly introduce
the set-up and the most important results. For more thorough references see [7, 8,
11, 12].

Let $G$ be an arbitrary weighted graph and let $A$ and $B$ be two players. We consider
the following on-line matching game on $G$. First, $A$ picks a multiset $S = \{s_1, \ldots, s_n\}$
of elements of $V(G)$; these are the servers. Then, one by one, $A$ discloses the requests, again a
multiset $R = \{r_1, \ldots, r_n\}$ of elements of $V(G)$. When an element of $R$ is requested, $B$ has to
match it with some unmatched element of $S$, and $B$ wishes to minimize the cost
of the resulting matching.

It is clear that usually $B$ cannot reach the offline minimum, and the competitive
ratio, that is, the online cost divided by the offline optimum, is infinite if one makes no
further assumption on $G$ (see Kalyanasundaram and Pruhs, and Khuller et al. in [7, 11]). It was
assumed in both papers that the weights are nonnegative and satisfy the triangle
inequality, so one may refer to the graph $G$ as a metric space $M = (X, d)$ with
underlying set $X$ and distance function $d$, while the multisets $S$ and $R$ are repeated points
of $M$. Then the best competitive ratio is exactly $2k - 1$. This is achieved for $K_{1,k}$,
the so-called star metric space, where the weights are all ones.

The randomized setup for the above on-line game is the following: first, $A$ has to
construct $S$ and $R$ in advance and disclose $S$. Then $A$ gives the points of $R$ one by
one, but this time he has no right to make any changes in the requests, no matter
how $B$ is playing. That is, not only $R$ but also the order in which its points
are requested is determined in advance. In this setup $B$ has the advantage of using
randomness when deciding which point of $S$ is to be matched with the newly requested
point. Let $\mathrm{opt}(\rho)$ be the total weight of the optimum matching for a sequence of
requests $\rho$. We say that $B$'s randomized strategy is $c$-competitive if for every request
sequence $\rho$

$$E[B(\rho)] \le c \cdot \mathrm{opt}(\rho),$$

where $E[B(\rho)]$ denotes the expected total weight of the matching $B$ finds for $\rho$.
Finding good randomized algorithms for the on-line minimum matching problem was first
addressed by Kalyanasundaram and Pruhs in [8]. They stated that the optimal
competitive ratio for a star metric space is $2H_k - 1$ (with $H_k$ the $k$-th harmonic
number), and conjectured an $O(\log n)$ upper
bound on the best competitive ratio for arbitrary metric spaces. Here and later $n$
stands for the number of servers (or requests).
Our goal is to show the following theorem. 

Theorem 1 There is a randomized on-line weighted matching algorithm for arbitrary
metric spaces which is $O(\log^3 n / \log\log n)$-competitive against an oblivious adversary.

The strategy of the proof is the following. First we show that it is enough to
consider the case when the metric space $M$ is finite; indeed, $X$ is the set of servers.
This costs only a constant factor of at most 3. Then we develop a randomized
weighted greedy matching algorithm (RWGM) that has competitive ratio $O(\log n)$
if the points of $M$ are the leaves of a hierarchically well separated tree, or HST. Here
the distance $d(x, y)$ is defined by adding up the weights on the edges of the unique
path connecting $x$ and $y$, and the edge weights grow exponentially by the levels of
the tree. In our case the smallest weights are of size $O(\log n)$. In order to use this
special case, we next recall earlier results on probabilistically approximating arbitrary
metric spaces by such trees. This approximation contributes an $O(\log^2 n / \log\log n)$
factor to the competitive ratio, so finally we arrive at an algorithm with competitive
ratio $O(\log^3 n / \log\log n)$.

Independently of this work Meyerson, Nanavati and Poplawski [13] exhibited a
randomized on-line algorithm for the matching problem. They also proved a polylog 
competitive ratio, and used HSTs. 

2 Discretizing the game 

Assume that we have an on-line matching algorithm MA that is $c$-competitive in
the possibly infinite metric space $M$ in case $R \subseteq S$ (multiplicities allowed). In this
section we will show that with a small loss in the competitive factor, MA can
easily be extended to an on-line matching algorithm MAI which works for arbitrary
$S, R \subseteq M$. The extension of the algorithm is based on a transformation of $R$ which
we call discretization.

Given $S$, assume that the elements of $R$ appear one after the other. To each $r_i \in R$ we
assign a point $g(r_i) \in S$. We determine $g(r_i)$ in a greedy fashion: if
$d(s_0, r_i) = \min_{s \in S} d(s, r_i)$, then $g(r_i) = s_0$ (breaking ties arbitrarily). Clearly, we can find $g(r_i)$
on-line. For $s \in S$ denote by $rm_s$ the number of requests which are assigned to $s$ by $g$.
The new multiset of requests is called $R'$, in which every $s \in S$ appears $rm_s$ times.
$R'$ is the discretized version of $R$.

As above, assume that MA is a $c$-competitive on-line algorithm in the case where
$R \subseteq S$. Clearly, after the discretization we arrive at an $R'$ such that $R' \subseteq S$. We
define another on-line algorithm MAI in the following way: we play an auxiliary
on-line matching game on $M$ using MA, and use MA's decisions to determine which
server MAI chooses to serve a request. Suppose that a request $r \in R$ appears. We
determine $g(r)$ and pass it to MA as the next request of the auxiliary game. If MA
chooses $s \in S$ to serve $g(r)$, then MAI serves $r$ using $s$.
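The extension from MA to MAI can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `GreedyMA` is a hypothetical stand-in for MA (any on-line algorithm for requests located in $S$ would do), and the names `nearest_server`, `serve` and `make_ma` are ours, not the paper's.

```python
from typing import Callable, Hashable, List, Sequence

Point = Hashable
Metric = Callable[[Point, Point], float]


def nearest_server(servers: Sequence[Point], r: Point, d: Metric) -> Point:
    """g(r): a server location closest to r (ties broken by list order)."""
    return min(servers, key=lambda s: d(s, r))


class GreedyMA:
    """Hypothetical stand-in for MA: greedy matching for requests located in S."""

    def __init__(self, servers: Sequence[Point], d: Metric):
        self.free: List[Point] = list(servers)  # multiset of unused servers
        self.d = d

    def serve(self, request: Point) -> Point:
        s = min(self.free, key=lambda x: self.d(x, request))
        self.free.remove(s)  # removes one copy only
        return s


class MAI:
    """Extension of MA to arbitrary requests via discretization."""

    def __init__(self, servers: Sequence[Point], d: Metric, make_ma=GreedyMA):
        self.servers = list(servers)
        self.d = d
        self.ma = make_ma(self.servers, d)  # auxiliary game played by MA

    def serve(self, r: Point) -> Point:
        g_r = nearest_server(self.servers, r, self.d)  # discretized request g(r)
        return self.ma.serve(g_r)  # MAI serves r with the server MA chose for g(r)
```

For instance, on the real line with servers at 0, 10 and 10, the requests 9, 4 and 11 are discretized to 10, 0 and 10 before MA sees them.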

Lemma 2 If MA is $c$-competitive, then MAI is $(2c + 1)$-competitive for arbitrary
$S, R \subseteq M$.

Proof: We start with some more notation. For a matching algorithm $A$ denote by $A(r_i)$
the distance from $r_i$ to $s$ if $A$ serves this request using $s$. Denote by $OM$ the optimal cost
matching between $S$ and $R$, and let $\mathrm{opt} = \mathrm{cost}(OM)$. $OM$ induces a matching $OM'$
(not necessarily of minimum cost) between $S$ and $R'$ in the obvious way: if $(r_i, s_j) \in OM$,
then $(g(r_i), s_j) \in OM'$. For an arbitrary matching $M$, let $M(r_i) = d(r_i, s_j)$ if
$(r_i, s_j) \in M$. Finally, let us denote by $\mathrm{opt}'$ the total cost of the minimum matching
between $S$ and $R'$.

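From a trivial lower bound on the optimum (each $r_i$ is at least as far from its partner
in $OM$ as from its nearest server $g(r_i)$) we have $\sum_{i=1}^{n} d(r_i, g(r_i)) \le \mathrm{opt}$. Note that
$\mathrm{cost}(OM') \ge \mathrm{opt}'$ by definition. By the triangle inequality
$MAI(r_i) \le MA(g(r_i)) + d(g(r_i), r_i)$, hence
$\sum_{i=1}^{n} MAI(r_i) \le \sum_{i=1}^{n} MA(g(r_i)) + \mathrm{opt}$.

Again by the triangle inequality, $OM'(g(r_i)) \le OM(r_i) + d(g(r_i), r_i)$
for all $i = 1, \ldots, n$, which sum up to $\mathrm{cost}(OM') \le \mathrm{cost}(OM) + \sum_{i=1}^{n} d(g(r_i), r_i)$. That
is,

$$\mathrm{opt}' \le \mathrm{cost}(OM') \le \mathrm{cost}(OM) + \sum_{i=1}^{n} d(r_i, g(r_i)) \le \mathrm{cost}(OM) + \mathrm{opt} \le 2\,\mathrm{opt}.$$

MA is a $c$-competitive on-line algorithm by assumption, i.e., $\sum_{i=1}^{n} MA(g(r_i)) \le c \cdot \mathrm{opt}'$.

We know that $MAI(r_i) \le MA(g(r_i)) + d(g(r_i), r_i)$, therefore
$\sum_{i=1}^{n} MAI(r_i) \le c \cdot \mathrm{opt}' + \mathrm{opt} \le (2c + 1)\,\mathrm{opt}$. □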

Remark. Lemma 2 gives an alternative proof of the theorem of Kalyanasundaram
and Pruhs [7] that the competitive ratio of the greedy algorithm is at most $2^n - 1$.
Indeed, let MA and MAI be the greedy algorithms for an $(n-1)$-element and an $n$-element
input, respectively, and use induction.

3 The algorithm RWGM



Our algorithm, the randomized weighted greedy matching algorithm, or RWGM
algorithm, is first developed for special metric spaces. Assume that the metric space
$M = (X, d)$ is defined by a weighted tree $T$. The set of the leaves of $T$ is $L \subseteq X$, and
the distance $d(x, y)$ of the leaves $x, y$ is the sum of the weights on the (unique) path
connecting $x$ and $y$. Let $\lambda > 1$ be a real number.

Definition 1 A $\lambda$-hierarchically well separated tree ($\lambda$-HST) is a rooted weighted tree
with the following properties:

• the edge weight from any node to each of its children is the same,

• the edge weights along any path from the root to a leaf decrease by the
factor $\lambda$ from one level to the next. The weight of an edge incident to a leaf is
one.
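As a small illustration of the definition, the following Python sketch computes distances between leaves of a $\lambda$-HST. It assumes, as the paper does later, that all leaves lie at the same depth, so that a vertex of height $h$ is joined to its parent by an edge of weight $\lambda^{h}$; the function names are ours.

```python
def edge_weight(height: int, lam: float) -> float:
    """Weight of the edge joining a vertex of the given height to its parent.

    Leaves have height 0, so leaf edges weigh 1 and the weights grow by a
    factor lam with every level towards the root.
    """
    return lam ** height


def leaf_distance(turning_height: int, lam: float) -> float:
    """Distance of two leaves whose lowest common ancestor has the given height."""
    # The path climbs from one leaf to the common ancestor and descends again,
    # so every level below the ancestor is crossed twice.
    return 2 * sum(edge_weight(h, lam) for h in range(turning_height))
```

With $\lambda = 4$, two leaves with a common parent are at distance 2, and two leaves whose lowest common ancestor has height 2 are at distance $2(1 + 4) = 10$.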

We define the RWGM algorithm first, then show in steps that it is $O(\log n)$-competitive
on a metric space determined by a $\lambda$-HST where $\lambda = 2(1 + \log n)$.

3.1 RWGM: a randomized weighted matching algorithm for hierarchically
well separated trees

Let us consider a $\lambda$-HST, denote it by $T = T(V, E, r)$, where $V$ is the vertex set, $E$ is
the edge set of $T$, and $r$ is the root. When playing the matching game only leaves of
$T$ will be matched to leaves of $T$. We denote the set of leaves by $L$. We will need the
notion of a subtree: given $v \in V$, the vertex $u \in V$ belongs to the subtree $T_v$ if the
unique path from $r$ to $u$ contains $v$. Clearly, $T = T_r$, and if $w \in L$, then $T_w$ contains
only the leaf $w$. We have the relation "<" among the subtrees containing a certain
leaf $w$: $T_v < T_{v'}$ if $|T_v| < |T_{v'}|$, where $w \in T_v$ and $w \in T_{v'}$.

In order to get an easier formulation of RWGM, we assume that if $u$ is a non-leaf
vertex of a $(\log n)$-HST, then all of its children are non-leaves or all are leaves. This
can be achieved by inserting "dummy" vertices in the tree. We can also assume that
the edge weights on a level are equal. (See [5].)

During the course of serving the requests, certain vertices will be painted
green, and leaves may have multiplicities. A vertex $x$ is green if the subtree $T_x$ contains
at least one unassigned server. We need multiplicities since a point (as a server) may
be listed with multiplicity, and also it may be requested several times. (Recall from
the introduction that $S$ and $R$ are multisets of elements of $V(G)$.) The colors and multiplicities of
the vertices may change in time, as we satisfy the requests and use up the servers.
We try to follow the greedy algorithm, and break ties by random selection by levels.

Informally, having a request $r$, we try to assign to $r$ a server $s$ as close as possible
according to the tree metric. One visualizes this as going up in $T$ until reaching the
first green vertex $x$, and then going down to an unassigned server. However, the way of
going down from $x$ is unintuitive: we choose uniformly among those edges $(x, y_1), (x, y_2), \ldots$
that lead to unassigned servers. One is tempted to go down on $(x, y)$ with probability
proportional to the number of unassigned servers in $T_y$. This other approach is
analyzed in [13].

Formal description of RWGM

In the beginning the adversary $A$ picks leaves of $T$ with multiplicity, corresponding
to the servers $S = \{s_1, \ldots, s_n\}$. (That is, if a leaf $x$ is provided $m$ times as a server then
$x$ has multiplicity $m$.)

We color a vertex $u$ of $T$ green if $T_u$ contains a leaf with positive multiplicity, and
call such subtrees green subtrees.

Then $A$ gives us the requests of $R$ one by one; denote them by $r_1, \ldots, r_n$.

Set $i = 1$.

• Step 1. The new request is $r_i$. $B$ looks for the smallest subtree $T_u$ which contains
$r_i$ and whose root $u$ is green.

• Step 2. Pick a leaf of $T_u$ among the leaves of positive multiplicity by the
algorithm Pick-a-leaf with input $u$. Let this leaf be $x$, and let $s_i$ (perhaps after
reordering) be an unused server corresponding to $x$; it is matched to $r_i$. Decrease
the multiplicity of $x$ by one.

• Step 3. For every green $w \in V$ check whether $T_w$ contains a leaf with positive
multiplicity. If not, erase $w$'s color.

• Step 4. If $i \le n - 1$, then set $i = i + 1$ and go to Step 1.

• Step 5. If $i = n$, then STOP.

Algorithm Pick-a-leaf($u$)

• Step 1. If the children of $u$ are leaves, then pick uniformly at random a leaf
among those of positive multiplicity. This is the leaf we have chosen. STOP.

• Step 2. If the children of $u$ are not leaves, then denote by $u_1, u_2, \ldots, u_t$ the green
children of $u$. Pick one of them uniformly at random, say $u_i$. Apply
Pick-a-leaf($u_i$).
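The two procedures above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the tree is passed as a hypothetical `parent` dictionary (the root maps to `None`), and instead of storing colors we keep a counter of unassigned servers per subtree, so that a vertex is green exactly when its counter is positive. A request located at a leaf that still holds a server is served by that leaf itself.

```python
import random
from collections import defaultdict


class RWGM:
    def __init__(self, parent, servers, rng=None):
        """parent: dict vertex -> parent (root -> None); servers: leaves, with repetition."""
        self.parent = dict(parent)
        self.children = defaultdict(list)
        for v, p in self.parent.items():
            if p is not None:
                self.children[p].append(v)
        self.rng = rng or random.Random()
        # free[v] = number of unassigned servers in T_v; v is green iff free[v] > 0.
        self.free = defaultdict(int)
        for s in servers:
            self._add(s, +1)

    def _add(self, leaf, delta):
        # Update the counters on the path from the leaf up to the root.
        v = leaf
        while v is not None:
            self.free[v] += delta
            v = self.parent[v]

    def pick_a_leaf(self, u):
        # Descend from u, choosing uniformly among the green children at each level.
        while self.children[u]:
            green = [c for c in self.children[u] if self.free[c] > 0]
            u = self.rng.choice(green)
        return u

    def serve(self, r):
        # Step 1: find the smallest green subtree containing the request leaf r.
        u = r
        while u is not None and self.free[u] == 0:
            u = self.parent[u]
        if u is None:
            raise ValueError("no unassigned server left")
        # Step 2: pick a leaf of positive multiplicity and use one server there.
        x = self.pick_a_leaf(u)
        # Step 3 is implicit: exhausted subtrees get counter 0 and stop being green.
        self._add(x, -1)
        return x
```

Passing a seeded `random.Random` as `rng` makes a run reproducible, which is convenient when comparing against the proportional rule mentioned earlier.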

Theorem 3 The algorithm RWGM is $O(\log n)$-competitive on a metric space
determined by a $\lambda$-HST against an oblivious adversary.

3.2 Proof of Theorem 3 

We prove Theorem 3 in steps. First we consider the case of a uniform metric space
where the multiplicities are all ones, but the sizes of $S$ and $R$ may not be equal. Then
we discuss the case where $S$ and $R$ have arbitrary multiplicities. Finally we prove
the general statement for HSTs; here the previous cases provide a basis for the induction
arguments.

3.2.1 Uniform case

In a uniform metric space the distance of two different points is one. It closely
resembles the star metric space $K_{1,k}$, where the leaves are at distance two from
each other. (This explains the extra factor of two in some of our later formulas.)

Assume that $U$ is the uniform metric space on $u$ points. Let $S = \{s_1, \ldots, s_q\}$ and
$R = \{r_1, \ldots, r_t\}$, where $s_i \ne s_j$ and $r_i \ne r_j$ if $i \ne j$. We also assume that the points of $R$
are requested in increasing order, first $r_1$, then $r_2$, etc., and finally $r_t$.

Before dealing with the general case, let us consider a simple but instructive
example, when $|S| = |R| = q$, and these sets share $q - 1$ points. Clearly, the worst
case is when the first request $r_1$ is not in $S$. Assigning $r_1$ to some $s_i$ for which there is
an $r_j = s_i$ destroys optimality. This mistake may spread when we match $r_j$. It was
noted in [8] that any randomized on-line algorithm for that instance has about $\log q$
expected cost, although the optimal cost is one. This explains why we have to take
special care of the common points of $S$ and $R$, and also of the order of requests.

Definition 2 We say that $s_i \in S$ has a partner if $s_i = r_j$ for some $r_j \in R$. Similarly,
$r_j \in R$ has a partner if $s_i = r_j$ for some $s_i \in S$.



We will give an ordering of the points of $S$ using the above mentioned ordering
on $R$. Firstly, if there exist $r_j$ and $r_\ell$ such that $s_i$ is the partner of $r_j$ and $s_k$ is the
partner of $r_\ell$ where $j < \ell$, then $s_i < s_k$. If $s_i$ has a partner and $s_k$ has no partner,
then $s_i < s_k$ and $r_j < s_k$ for all $j$. Finally, we fix an arbitrary ordering among those
points of $S$ which have no partner in $R$. Notice that we can extend the orderings of
$S$ and $R$ into an ordering "<" of $S \cup R$. This is done such that if $r_i$ is the partner of
$s_j$ then $r_i < s_j$, and for $r_\ell > r_i$ we have $s_j < r_\ell$. The points of $S$ having no partner
go to the end of the ordering.

Given $r_i \in R$ we associate a weight $w(r_i)$ with it. It is based on the difference of the number
of servers following $r_i$ and the number of requests without partner preceding $r_i$. Let
us assume that $r_i$ has no partner; then

$$v_i = |\{s_j : s_j > r_i\}| - |\{r_k : r_k < r_i \text{ and } r_k \text{ has no partner}\}|.$$

If $r_i$ has a partner, then let $v_i = 0$. Furthermore let $H_m = 1 + \frac{1}{2} + \cdots + \frac{1}{m}$, that
is, the $m$-th harmonic number. Then we define $w(r_i) = H_{v_i}$ (we let $H_f = 0$ if $f \le 0$).
We need the following useful lemma.

Lemma 4 For $n \ge 1$, $H_n = 1 + \frac{H_{n-1} + \cdots + H_1}{n}$.

Proof: Trivial computation. □
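The identity $H_n = 1 + (H_1 + \cdots + H_{n-1})/n$, which is how we read Lemma 4, can be checked with exact arithmetic. The sketch below also evaluates a recurrence for the instructive example above; that recurrence is our own reading, assuming the displaced request picks uniformly among the $m$ remaining servers and that the requests arrive in the order of their partners, so it should be taken as an illustration rather than as part of the proof.

```python
from fractions import Fraction


def harmonic(m: int) -> Fraction:
    """H_m = 1 + 1/2 + ... + 1/m, with H_m = 0 for m <= 0."""
    return sum((Fraction(1, k) for k in range(1, m + 1)), Fraction(0))


def lemma4_rhs(n: int) -> Fraction:
    """Right-hand side 1 + (H_1 + ... + H_{n-1}) / n of the identity."""
    return 1 + sum((harmonic(k) for k in range(1, n)), Fraction(0)) / n


def chain_cost(q: int) -> Fraction:
    """Assumed recurrence f(m) = 1 + (f(1) + ... + f(m-1)) / m for the example."""
    f = [Fraction(0)]  # f[0] is a placeholder so that f[m] has index m
    for m in range(1, q + 1):
        f.append(1 + sum(f[1:m], Fraction(0)) / m)
    return f[q]
```

Under these assumptions the recurrence has the same form as the identity, so `chain_cost(q)` equals `harmonic(q)`, in line with the "about $\log q$" expected cost quoted from [8].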

Lemma 5 Let $\delta = |R - S|$. Then in the case above the expected cost of RWGM is
at most $H_q + H_{q-1} + \cdots + H_{q-\delta+1}$.

Proof: We proceed by induction on $q$, that is, the size of $S$. Notice that we may
assume that $r_1$ has no partner, otherwise we can immediately apply the induction
hypothesis. Now $r_1$ is matched to some randomly chosen $s_j \in S$. One can check by
the definition of $v_i$ that the weights of the elements of $R \setminus \{r_1\}$ are invariant if $s_j$ had
no partner. If $s_j$ had the partner $r_\ell$, then the expected new weight of $r_\ell$ is at most
$(H_{q-1} + \cdots + H_1)/q$. Now by induction one can see that for the resulting smaller
subproblem the random algorithm has expected cost $H_{q-1} + \cdots + H_{q-\delta+1}$. Matching
$r_1$ to $s_j$ costs one, hence the expected cost of the algorithm is at most

$$1 + \frac{H_{q-1} + \cdots + H_1}{q} + H_{q-1} + \cdots + H_{q-\delta+1} = H_q + H_{q-1} + \cdots + H_{q-\delta+1},$$

by Lemma 4. □
3.2.2 The case of multiplicities 

We want to handle the case when both the servers and the requests have various
multiplicities. Note that a server with zero multiplicity simply means that there
is no server at that point. If $U = \{x_1, \ldots, x_u\}$, then let $ms(x_i)$ and $mr(x_i)$ be the
multiplicities of servers and requests at the point $x_i$, respectively. Let
$\delta(x_i) = \max\{0, mr(x_i) - ms(x_i)\}$ and $\delta = \sum_{i=1}^{u} \delta(x_i)$.

Lemma 6 The expected cost of RWGM is at most $H_q + H_{q-1} + \cdots + H_{q-\delta+1}$.

Proof: Fix a maximum matching between servers and requests which belong to the
same point. Pretend that the remaining unmatched equal servers/requests are at
different points, and apply Lemma 5. □



3.2.3 General $\lambda$-HSTs

We proceed by induction on the height of the $\lambda$-HST. First, we need a more
technical form of the hypothesis and some definitions.

Definition 3 Given $s \in S$ and $r \in R$ which are matched in some matching $M$,
consider the path connecting them in the HST. Call the point at the highest level
of this path the turning point of $s$ and $r$, $t_M(s, r)$ for short. For a point $u$ of the tree let
$\tau_M(u)$ be the number of matched pairs $(s, r)$ in $M$ for which $u$ is the turning point.

Given a point $u$, $h(u)$ will denote the height of $u$. We can express the cost of an
arbitrary matching $M$ as

$$2 \sum_{u} \tau_M(u) \sum_{i=1}^{h(u)} \lambda^{i-1}.$$

Observe that $\tau_M(u)$ is the same for any optimal matching $M$, hence in this case
we suppress the subscript $M$. Note that $\tau(u)$ is obvious to compute. Moreover, one
can express the optimal cost:

$$\mathrm{opt} = 2 \sum_{u} \tau(u) \sum_{i=1}^{h(u)} \lambda^{i-1}.$$
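One way to compute $\tau(u)$ and the optimal cost $2\sum_u \tau(u)\sum_{i=1}^{h(u)} \lambda^{i-1}$ is a single bottom-up pass. The sketch below rests on assumptions of ours: the tree is given as a hypothetical `children` dictionary, all leaves are at the same depth, and an optimal matching is obtained by matching surplus servers and requests as low in the tree as possible (which is the standard argument for tree metrics).

```python
from collections import Counter


def turning_counts(children, root, servers, requests):
    """Return ({u: tau(u)}, {u: h(u)}) for the tree described by `children`."""
    ms, mr = Counter(servers), Counter(requests)
    tau, height = {}, {}

    def visit(u):
        """Return (#requests - #servers) left unmatched inside T_u."""
        kids = children.get(u, [])
        if not kids:  # leaf: co-located pairs are matched at cost 0
            height[u] = 0
            tau[u] = min(ms[u], mr[u])
            return mr[u] - ms[u]
        excess = [visit(c) for c in kids]
        height[u] = 1 + max(height[c] for c in kids)
        surplus_requests = sum(e for e in excess if e > 0)
        surplus_servers = -sum(e for e in excess if e < 0)
        tau[u] = min(surplus_requests, surplus_servers)  # pairs turning at u
        return surplus_requests - surplus_servers

    visit(root)
    return tau, height


def optimal_cost(tau, height, lam):
    """opt = 2 * sum_u tau(u) * sum_{i=1}^{h(u)} lam^(i-1)."""
    return 2 * sum(t * sum(lam ** (i - 1) for i in range(1, height[u] + 1))
                   for u, t in tau.items())
```

For example, with $\lambda = 4$, two servers at a leaf `a1`, and requests at its sibling `a2` and at a leaf `b1` under a different parent, one pair turns at the common parent (cost 2) and one at the root (cost 10), giving the optimum 12.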

For trees of height less than $d$ our induction hypothesis is the following inequality:

$$E[\mathrm{RWGM}] \le 2 \sum_{u} \tau(u) \sum_{i=1}^{h(u)} c_i \lambda^i,$$

where $\lambda = 2(1 + \log n)$, $c_1 = 1/2$ and $c_\ell := c_{\ell-1} + (1/2)^{\ell}$ for $\ell > 1$. Notice that since
$c_\ell < 1$, the above statement proves that RWGM is $O(\log n)$-competitive against an
oblivious adversary, implying Theorem 3.

For trees of height one the statement follows from Lemma 6 and its remark.
Consider a tree $T$ of height $d$. We make a new tree $T'$ and a new instance $S'$ and $R'$. $T'$
comes from $T$ by pruning the leaves, and with each $u \in T$ with $h(u) = 1$ we associate
as server and request multiplicities the sums of the server and request multiplicities of
its descendants in $T$. $T^*$ denotes the set of subtrees of $T$ of height one, i.e. the leaves
and their parents. Note that we have to divide the edge weights of $T'$ by $\lambda$ in order
to get a $\lambda$-HST.

One can cut the optimal cost for $S$, $R$ and $T$ in two parts. The first part is the
optimal cost of $S'$, $R'$ and $T'$, which we call $\mathrm{opt}'$. The second part is the cost incurred
on $T^*$; this is $\mathrm{opt}^*$. Here we have to take care of the cases when the number of requests
is greater than the number of servers in a subtree $T_u$ ($h(u) = 1$). Then we consider
the partial optimal matching using those servers. Let us call the cost of this partial
matching, $\mathrm{opt}_u$, the optimum for this case.

Clearly, $\mathrm{opt}^* = \sum_{u : h(u) = 1} \mathrm{opt}_u = \sum_{u : h(u) = 1} 2\tau(u)$, and one concludes that

$$\mathrm{opt} = \lambda \cdot \mathrm{opt}' + \sum_{u : h(u) \ge 2} 2\tau(u) + \mathrm{opt}^*.$$

Unfortunately, the on-line cost on T is not the sum of the on-line costs of the two 
parts if we handle the parts separately, but they are closely related. 

For this reason we have to take care of the costs occurring in $T^*$ when
a request is assigned to a leaf of a tree to which it is not assigned in the optimal
matching. The exact form of this statement is spelled out in Lemma 7.

Let $M$ be a random matching resulting from the run of RWGM on our tree. Then
$\tau_M(u)$ is a random variable for each non-leaf $u$, and $M = \sum_{u} \tau_M(u)$ is a random
variable again.

Lemma 7

$$E[M] \le \sum_{u : h(u) \ge 1} \tau(u) \sum_{i=1}^{h(u)} (1 + \log n)^i.$$

Proof: We prove Lemma 7 by induction on the height of the tree. It is true for trees
of height one by Lemma 6. Assuming that the lemma is true for trees of height at
most $h$, we will show it for trees of height $h + 1$.

Let $T$ be a $\lambda$-HST of height $h + 1$. We define $T'$ and $T^*$ as before. $M'$ is just
the truncated sum of $M$ on $T'$. By the induction hypothesis we have the following
inequality:

$$E[M'] \le \sum_{u : h(u) \ge 2} \tau(u) \sum_{i=1}^{h(u)-1} (1 + \log n)^i.$$

Note furthermore that every extra request arriving from $T'$ to a vertex $u$ of height one
(i.e. to the root of a tree $T_u$ of the forest $T^*$) increases the expected cost of RWGM
in $T_u$ by at most $\log n$ by Lemma 6.

The average cost on the trees of T* comes from two sources; one is opt*, the other 
is M'. In order to get an upper bound on the cost on T* we have to add them up 
and multiply both by log n, according to the explanation in the previous paragraph. 
This way we have 

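$$E[M] \le \log n \Big( \sum_{u : h(u) = 1} \tau(u) + E[M'] \Big) + E[M']
\le \sum_{u : h(u) = 1} \tau(u) \log n + \sum_{u : h(u) \ge 2} \tau(u) \sum_{i=2}^{h(u)} (1 + \log n)^i
\le \sum_{u : h(u) \ge 1} \tau(u) \sum_{i=1}^{h(u)} (1 + \log n)^i,$$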

which proves the lemma. □ 
Now we will use this lemma to prove that

$$E[\mathrm{RWGM}] \le 2 \sum_{u} \tau(u) \sum_{i=1}^{h(u)} c_i \lambda^i.$$

Again we will proceed by induction. Assume that the statement is true for trees
of height at most $h$, and consider a tree $T$ of height $h + 1$. We prune the leaves of $T$,
thereby getting $T'$. Recall that the edge weights in $T'$ have to be divided by $\lambda$ so as to get
that $T'$ is a $\lambda$-HST. For $T'$ the statement is true by the induction hypothesis. That
is, the expected cost of RWGM on $T'$ is at most

$$E[\mathrm{RWGM}(T')] \le 2 \sum_{u : h(u) \ge 2} \tau(u) \sum_{i=1}^{h(u)-1} c_i \lambda^i.$$

Clearly, if we add this up with the expected cost at level one, we get an upper 
bound for the expected cost of RWGM on T: 

$$E[\mathrm{RWGM}(T)] \le \lambda \cdot E[\mathrm{RWGM}(T')] + 2 \cdot E[M].$$

We want to show that this is at most

$$2 \sum_{u} \tau(u) \sum_{i=1}^{h(u)} c_i \lambda^i.$$

The coefficient of $\tau(u)$ in the upper bound is at most $2\sum_{i=1}^{\ell} c_i \lambda^i$ for every $u$ at level
$\ell$. For $\ell = 1$, it follows since $\log n < c_1 \cdot 2(1 + \log n)$.

For $\ell > 1$, we need to show that

$$\log n \sum_{i=1}^{\ell-1} (1 + \log n)^i + \sum_{i=1}^{\ell-1} c_i \lambda^{i+1} \le \sum_{i=1}^{\ell} c_i \lambda^i.$$

It follows if

$$\log n \sum_{i=1}^{\ell-1} (1 + \log n)^i \le \sum_{i=2}^{\ell} (c_i - c_{i-1}) \lambda^i + c_1 \lambda.$$

Since $\log n \sum_{i=1}^{\ell-1} (1 + \log n)^i = (1 + \log n)^{\ell} - (1 + \log n)$, it reduces to

$$(1 + \log n)^{\ell} \le \sum_{i=1}^{\ell} \left(\tfrac{1}{2}\right)^i (2 + 2\log n)^i = \sum_{i=1}^{\ell} (1 + \log n)^i.$$

□ 
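As a sanity check of the last step, the inequality $\log n \sum_{i=1}^{\ell-1} (1+\log n)^i + \sum_{i=1}^{\ell-1} c_i \lambda^{i+1} \le \sum_{i=1}^{\ell} c_i \lambda^i$ with $\lambda = 2(1 + \log n)$ and $c_i = 1 - 2^{-i}$ can be evaluated exactly for small parameters. The sketch below assumes integer values of $\log n$ purely to keep the arithmetic exact; the function names are ours.

```python
from fractions import Fraction


def c(i: int) -> Fraction:
    """c_1 = 1/2 and c_i = c_{i-1} + (1/2)^i, that is, c_i = 1 - 2^(-i)."""
    return 1 - Fraction(1, 2 ** i)


def sides(log_n: int, level: int):
    """Return (lhs, rhs) of the inequality for a vertex at the given level >= 2."""
    lam = 2 * (1 + log_n)
    lhs = (log_n * sum((1 + log_n) ** i for i in range(1, level))
           + sum(c(i) * lam ** (i + 1) for i in range(1, level)))
    rhs = sum(c(i) * lam ** i for i in range(1, level + 1))
    return lhs, rhs
```

For $\log n = 1$ and level 2 this gives the pair $(10, 14)$; the gap between the two sides is $\sum_{i=1}^{\ell-1} (1 + \log n)^i + (1 + \log n)$, which is positive.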

4 Approximating by hierarchically well separated trees 

The first results and applications of hierarchically well separated trees are due to
Bartal, see [2, 3]. They generalized the earlier works of Karp [10] and Alon et al. [1], in
which the distances in certain graphs were approximated by using randomly selected
spanning trees.

Definition 4 A metric space $M = (X, d_M)$ dominates a metric space $N = (X, d_N)$
if for every $x, y \in X$ we have $d_N(x, y) \le d_M(x, y)$.

Definition 5 A set of metric spaces $\mathcal{S}$ over $X$ $\alpha$-probabilistically approximates a
metric space $M$ over $X$, if every metric space in $\mathcal{S}$ dominates $M$, and there exists
a probability distribution over metric spaces $N \in \mathcal{S}$ such that for every $x, y \in X$ we
have $E[d_N(x, y)] \le \alpha\, d_M(x, y)$.

The proof of Theorem 1 is based on the following theorem. 

Theorem 8 [5] Every weighted graph on $n$ vertices can be $\alpha$-probabilistically
approximated by a set of $\lambda$-HSTs, for an arbitrary $\lambda > 1$, where $\alpha = O(\lambda \log n / \log \lambda)$.

As noted by Bartal [2], having an approximation of a metric space $M$ by HSTs
along with a good algorithm for such trees always results in a good randomized
algorithm in that space. So, what we do is the following. First, preprocessing: given
the set of servers $S$, these points span a sub-metric space $M_S \subseteq M$. Clearly, $|M_S| \le n$,
since $S$ is a multiset of $n$ elements. We approximate $M_S$ by a set of $O(\log n)$-HSTs.
Plugging $\lambda = 2(1 + \log n)$ into Theorem 8 we get that there is a probability distribution
$\mathcal{D}$ on these trees such that the expected distortion is $O(\log^2 n / \log\log n)$. Choose
one tree at random according to $\mathcal{D}$. This finishes the preprocessing. Whenever a
request $r \in R$ appears, we determine $g(r)$ (see Section 2), and use RWGM with this
new request $g(r)$. We proved in Section 3 that RWGM is an $O(\log n)$-competitive
algorithm in this case. Applying Lemma 2 and Theorem 8, we get that RWGM is
$O(\log^3 n / \log\log n)$-competitive for $M$. This proves Theorem 1. □
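The arithmetic behind the final bound can be written out as follows. This is a sketch under the assumption, standard in Bartal's framework, that a $c_T$-competitive algorithm on the trees yields an $O(\alpha\, c_T)$-competitive algorithm on the approximated space; the symbols $c_T$ and $c$ are ours, and constants are absorbed into the $O$-notation.

```latex
\begin{align*}
\lambda &= 2(1+\log n) \;\Longrightarrow\; \log\lambda = \Theta(\log\log n),\\
\alpha &= O\!\left(\frac{\lambda\log n}{\log\lambda}\right)
        = O\!\left(\frac{\log^{2} n}{\log\log n}\right),\\
c &= O(\alpha\cdot c_T) = O(\alpha\cdot\log n)
   = O\!\left(\frac{\log^{3} n}{\log\log n}\right)
   \quad\text{(requests located in } S\text{)},\\
2c+1 &= O\!\left(\frac{\log^{3} n}{\log\log n}\right)
   \quad\text{(arbitrary requests, by Lemma 2)}.
\end{align*}
```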